Eviction Strategies for Semantic Flow Processing

Authors

  • Minh Khoa Nguyen
  • Thomas Scharrenbach
  • Abraham Bernstein

Abstract

To cope with ever-increasing data volumes, continuous processing of incoming data via Semantic Flow Processing systems has been proposed. These systems answer queries over streams of RDF triples. To do so, they match (triple) patterns against the incoming stream and generate or update variable bindings. Given the continuous nature of the stream, however, the number of bindings can explode and exceed available memory, in particular when computing aggregates. To keep processing practical, Semantic Flow Processing systems therefore typically limit the considered data to a (moving) window. While this technique is simple, it may fail to find patterns that span more than the window, and it may still cause memory overruns when the data is highly bursty. In this paper we propose to maintain bindings (and thus memory) based not on recency (i.e., a window) but on their likelihood of contributing to a complete match. That is, the eviction decision is based on matching likelihood rather than on creation time (FIFO) or random choice. Furthermore, we propose to drop variable bindings rather than data, as load shedding approaches do. Specifically, we systematically investigate deterministic and matching-likelihood-based probabilistic eviction strategies for dropping variable bindings in terms of recall. We find that matching-likelihood-based eviction can outperform FIFO and random eviction strategies on synthetic as well as real-world data.
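The contrast between the three eviction criteria named in the abstract can be illustrated with a minimal sketch. The abstract does not say how the matching likelihood is estimated or how the probabilistic strategy samples its victim, so everything here is an assumption made for illustration: the `Binding` and `BindingStore` classes, the `match_likelihood` field (taken as a precomputed number in [0, 1]), and the deterministic "evict the least likely binding" rule, which stands in for the paper's probabilistic strategy.

```python
import random
from dataclasses import dataclass, field
from itertools import count
from typing import Callable, Dict, List

# Global arrival counter; FIFO eviction uses it as the creation time.
_creation_counter = count()


@dataclass
class Binding:
    """A (partial) variable binding. All fields are hypothetical."""
    variables: Dict[str, str]
    match_likelihood: float  # assumed estimate in [0, 1], not from the paper
    created: int = field(default_factory=lambda: next(_creation_counter))


Policy = Callable[[List[Binding]], Binding]


def evict_fifo(bindings: List[Binding]) -> Binding:
    """Pick the oldest binding (smallest creation counter)."""
    return min(bindings, key=lambda b: b.created)


def evict_random(bindings: List[Binding]) -> Binding:
    """Pick a binding uniformly at random."""
    return random.choice(bindings)


def evict_least_likely(bindings: List[Binding]) -> Binding:
    """Pick the binding with the lowest estimated matching likelihood.

    Deterministic simplification; the paper's strategy is probabilistic.
    """
    return min(bindings, key=lambda b: b.match_likelihood)


class BindingStore:
    """Bounded store that drops one binding whenever it is full."""

    def __init__(self, capacity: int, choose_victim: Policy) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.choose_victim = choose_victim
        self.bindings: List[Binding] = []

    def add(self, binding: Binding) -> None:
        # Evict before inserting so memory never exceeds the capacity.
        if len(self.bindings) >= self.capacity:
            self.bindings.remove(self.choose_victim(self.bindings))
        self.bindings.append(binding)


def surviving_likelihoods(policy: Policy) -> List[float]:
    """Feed three bindings into a store of capacity 2 and report survivors."""
    store = BindingStore(capacity=2, choose_victim=policy)
    for likelihood in (0.9, 0.1, 0.5):
        store.add(Binding({"?x": "ex:subject"}, likelihood))
    return [b.match_likelihood for b in store.bindings]
```

With these toy inputs, `surviving_likelihoods(evict_fifo)` drops the oldest binding even though it has the highest likelihood, whereas `surviving_likelihoods(evict_least_likely)` keeps it. This is the intuition behind preferring likelihood over recency, though it says nothing about how the estimate would be obtained in a real system.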


Similar resources

The CLOCK Data-Aware Eviction Approach: Towards Processing Linked Data Streams with Limited Resources

Processing streams rather than static files of Linked Data has gained increasing importance in the web of data. When processing data streams system builders are faced with the conundrum of guaranteeing a constant maximum response time with limited resources and, possibly, no prior information on the data arrival frequency. One approach to address this issue is to delete data from a cache during...


Paging for Multicore (CMP) Caches

In the last few years, multicore processors have become the dominant processor architecture. While cache eviction policies have been widely studied both in theory and practice for sequential processors, in the case in which various simultaneous processes use a shared cache the performance of even the most common eviction policies is not yet fully understood, nor do we know if curr...


Developing a Semantic Similarity Judgment Test for Persian Action Verbs and Non-action Nouns in Patients With Brain Injury and Determining its Content Validity

Objective: Evidence from brain trauma suggests that the two grammatical categories of noun and verb are processed in different regions of the brain due to differences in the complexity of grammatical and semantic information processing. Studies have shown that verbs belonging to different semantic categories lead to neural activity in different areas of the brain, and action verb processing is r...


Semantic Processing Ability in Persian-Speaking Alzheimer’s Patients

Objectives: This paper aims to explore whether the Persian-speaking patients of different stages, ranging from mild to moderate, have a deficit in semantic processing by comparing the performance of Alzheimer’s patients with that of the healthy individuals. Methods: The subjects of both the groups of Alzheimer’s patients and healthy control were matched for age, the state of monoli...


Towards A Cache-Enabled, Order-Aware, Ontology-Based Stream Reasoning Framework

While streaming data have become increasingly popular in business and research communities, semantic models and processing software for streaming data have not kept pace. Traditional semantic solutions have not addressed transient data streams. Semantic web languages (e.g., RDF, OWL) have typically addressed static data settings and linked data approaches have predominantly addressed stati...




Publication date: 2013